Anny Leema, A.
- Web Forum Crawling Using Index Thread Page Flipping Algorithm
Authors
1 Department of Computer Applications, B.S. Abdur Rahman University, Chennai, Tamil Nadu, IN
Source
International Journal of Business Analytics and Intelligence, Vol 2, No 1 (2014), Pagination: 36-40
Abstract
Internet forums are important platforms where users can post requests and exchange information from different sources. Existing systems suffer from the URL type recognition problem, which leads them to follow duplicate links and uninformative pages. The Index Thread Page Flipping (ITF) algorithm is used to overcome this issue. URL layout and page layout are used to recognise whether a URL is valid or invalid.
This project (Phase-I), "Web Forum Crawling using Index Thread Page Flipping Algorithm", determines whether links are valid or invalid, with the goal of crawling only relevant content. To address the URL type recognition problem in Internet forums, the crawler learns the correct path or URL using regular expression patterns and training sets created from page type classifiers.
The modules implemented are the user interface design module, page flipping module, entry URL discovery module, index/thread URL detection module and generic crawler module. In the user interface design module, users must give their user name and password to connect to the server. In the page flipping module, a long forum thread is divided across several pages, which are linked by page-flipping links; generic crawlers process each page individually and ignore the relationships between such pages. In the entry URL discovery module, an entry URL must be specified before the process can start, and rules are defined to find it. In the index and thread URL detection module, index URLs and thread URLs are identified by their URL patterns. In the generic crawler module, given a forum, the crawler enters the thread pages and crawls them while avoiding duplicate links and page-flipping links.
The front end for all the modules in the project (Phase-I) is designed using Eclipse, and the back end uses SQL Server 2005. The two modules in the project (Phase-I) are implemented using Java Servlets and JSP, with the code-behind written in Java. The main benefit of this project (Phase-I) is that it saves bandwidth and time.
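The index/thread URL detection and duplicate avoidance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (which the abstract says was written in Java): the path patterns, the `f`, `t` and `page` query parameters, and the function names `normalize`, `classify` and `select_links` are all hypothetical, chosen to resemble a common forum URL layout.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical path patterns for an imaginary forum; a real crawler would
# learn such patterns per site rather than hard-code them.
PATH_PATTERNS = {
    "index": (re.compile(r"/forumdisplay\.php$"), "f"),
    "thread": (re.compile(r"/showthread\.php$"), "t"),
}


def normalize(url):
    """Drop the fragment and sort query parameters so duplicates compare equal."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, query, ""))


def classify(url):
    """Label a URL as 'index', 'thread', 'page-flipping' or 'other'."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    for kind, (path_re, id_param) in PATH_PATTERNS.items():
        if path_re.search(parts.path) and params.get(id_param, "").isdigit():
            # An assumed 'page' parameter marks a page-flipping link.
            return "page-flipping" if "page" in params else kind
    return "other"


def select_links(urls):
    """Keep recognised forum URLs once each, in first-seen order."""
    seen = set()
    selected = []
    for url in urls:
        canonical = normalize(url)
        kind = classify(canonical)
        if kind == "other" or canonical in seen:
            continue  # skip uninformative pages and duplicate links
        seen.add(canonical)
        selected.append((canonical, kind))
    return selected
```

Calling `select_links` on a list of extracted links returns `(url, kind)` pairs; a crawler could then decide per kind whether to follow, queue or skip each link.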
Keywords
Forum Crawling, Index URL, Thread URL, Page Flipping URL.
- Real-Time Event Detection and Earthquake Reporting System for the Twitter Users
Authors
1 Department of Computer Applications, B. S. Abdur Rahman University, Chennai, IN
2 Department of Computer Applications, B.S. Abdur Rahman University, Chennai, IN
Source
International Journal of Business Analytics and Intelligence, Vol 2, No 1 (2014), Pagination: 41-45
Abstract
Twitter is a popular microblogging service whose key characteristic is its real-time nature. Earthquake events on Twitter are treated as real-time interactions: messages are monitored to detect a target event, and a particle filter algorithm is used to estimate the location of the event. The Japan Meteorological Agency (JMA) broadcast technique is used to distribute warning information to stations and mobile phone companies before an earthquake occurs, but JMA broadcast announcements take more time to deliver earthquake early warning information to users.
In the proposed system, real-time interaction around events such as earthquakes on Twitter is investigated; the system detects earthquakes promptly and delivers notifications much faster than JMA broadcast announcements. Each Twitter user is treated as a sensor, and particle filtering, which is widely used for location estimation in ubiquitous computing, is applied.
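To make the location-estimation step concrete, here is a minimal sketch of a generic bootstrap particle filter in Python. It is not the system described in the abstract: the function name `estimate_location`, the search bounds, the Gaussian noise model and every parameter value are assumptions for illustration. Each observation stands for the reported position of one "sensor" (a user's tweet).

```python
import math
import random


def estimate_location(observations, n_particles=2000, noise_std=1.0,
                      jitter_std=0.1,
                      bounds=((30.0, 46.0), (129.0, 146.0)), seed=0):
    """Estimate an event location (lat, lon) from noisy sensor reports.

    Generic bootstrap particle filter; all defaults are illustrative
    assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    (lat_lo, lat_hi), (lon_lo, lon_hi) = bounds
    # Start with particles spread uniformly over the assumed search area.
    particles = [(rng.uniform(lat_lo, lat_hi), rng.uniform(lon_lo, lon_hi))
                 for _ in range(n_particles)]
    for obs_lat, obs_lon in observations:
        # Weight each particle by a Gaussian likelihood of the observation.
        weights = [math.exp(-((lat - obs_lat) ** 2 + (lon - obs_lon) ** 2)
                            / (2.0 * noise_std ** 2))
                   for lat, lon in particles]
        if sum(weights) == 0.0:
            continue  # observation too far from every particle; skip it
        # Resample in proportion to the weights, then add a small random
        # walk so the particle set does not collapse onto a few points.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        particles = [(lat + rng.gauss(0.0, jitter_std),
                      lon + rng.gauss(0.0, jitter_std))
                     for lat, lon in particles]
    # After resampling, all particles carry equal weight, so the estimate
    # is their plain mean.
    est_lat = sum(lat for lat, _ in particles) / n_particles
    est_lon = sum(lon for _, lon in particles) / n_particles
    return est_lat, est_lon
```

Feeding the function a list of `(latitude, longitude)` reports scattered around a true epicentre should yield an estimate close to that point; accuracy depends on the number of particles and on how well the assumed noise model matches the reports.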
Keywords
Twitter, Micro Blogging Service, Event Detection, Earthquake, Location Estimation and Social Sensor, Particle Filter, Semantic Analysis.
- SC-TPDP Protocol to secure Multi-Cloud Storage from XSS Attacks
Authors
1 Department of MCA, Ethiraj College for Women, Chennai - 600008, Tamil Nadu, IN
2 Department of MCA, B. S. Abdur Rahman University, Chennai - 600048, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 19 (2016), Pagination:
Abstract
Cloud computing is in demand for various technical needs. Its success depends on how data is processed on its way to cloud servers and on how the service is protected from XSS attacks. The objective of this study is to design a method that can mitigate XSS attacks. Methodology: The integrity testing protocol must be efficient in order to save cost. From these two points, the proposed secure cloud transmission protocol, SC-TPDP, is designed and developed to mitigate XSS attacks. The design of this framework will help provide a more secure protocol for customers who use cloud computing technologies over an untrusted Internet. The remote data reliability testing model is SC-TPDP (Secure Cloud Transmission Provable Data Possession) in multi-cloud storage, and the SC-TPDP protocol is designed using bilinear pairings. Findings/Improvements: The proposed SC-TPDP protocol is provably secure and draws on the efficiency of the Blowfish algorithm. In addition to removing the need for certificate management, the SC-TPDP protocol is efficient and flexible. Based on the client's authorization, the proposed protocol can realize private verification, delegated verification, public verification and the tracking of attackers/hackers. Application: The SC-TPDP protocol focuses on securing provable data possession in multi-cloud storage against XSS attacks. The protocol is secure and minimizes attackers' entry, thus protecting cloud end users from XSS attacks.
Keywords
Blowfish Cryptography, Cloud Computing, Hackers Protection, Provable Data Possession, SC-TPDP, TOTP, XSS Attack Protection.
- Sentiment Mining from Online Patient Experience using Latent Dirichlet Allocation Method
Authors
1 Department of Computer Applications, B. S. Abdur Rahman University, Chennai - 600048, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 19 (2016), Pagination:
Abstract
Background/Objectives: This paper focuses on the problem of mining sentiments in health care text. It applies sentiment analysis to extract patients' feelings about healthcare under emotion labels such as happiness, sadness and surprise. Methods/Statistical Analysis: The connection between social emotions and affective terms is predicted automatically from patient experience using a joint emotion-topic model that augments Latent Dirichlet Allocation (LDA) with a layer for emotion modeling. The system comprises six modules: Preprocessing, Topic Generation, Polarity Classification, Sentiment Classification, Sentiment Analysis and Aspect Ranking. A set of latent topics is first generated from emotions, and affective terms are then generated from each latent topic. Finally, K-means clustering is applied to detect the emotion, and an aspect ranking technique is used to calculate the weightage of the document. Findings: Many sentiment prediction approaches do not provide a detailed description of the sentiments reflected in reviews of patient experience. Experimental results show that the proposed model successfully identifies meaningful latent topics for each emotion. The identified emotions are useful for categorizing documents and help online users select the healthcare they need based on their emotional preferences. Application/Improvements: Based on the prediction accuracy achieved, the machine learning process can make a careful determination of patient opinion about the various administrative aspects of a hospital. Various machine learning predictions are correlated with the results of more conventional surveys. It will be interesting to develop more efficient algorithms based on topic models for other opinion mining systems and for large-scale data sets.
Keywords
K-means, LDA, Patient Experience, Sentiment Analysis, Topic Generation.
- Proliferation of E-Learning in Indian Universities through the Analysis of Existing LMS Scenario: A Novel Approach
Authors
1 Department of Computer Applications, B.S. Abdur Rahman University, Vandalur, Chennai 600048, Tamil Nadu, IN